Probabilistic Models and Motif Discovery Biosequences

Authors

  • Richard M. Karp
  • Trey Ideker
Abstract

The goal of this section of the course is to characterize families of related DNA or protein sequences. We may be interested in characterizing the sequence surrounding the binding sites of particular proteins; promoter regions; the location of genes, exons, or splice junctions in DNA sequence; families of repeated sequences; or families of proteins with similar function. The methods presented in this section of the course contrast with the approaches Larry Ruzzo has taken in his past few lectures. Larry discussed methods for sequence searching and matching. These methods included:

Similar resources

Summary Report of My Scientific Activities during ERCIM Postdoc fellowship at NTNU

Motif discovery is a crucial part of regulatory network identification, and is therefore widely studied in the literature. Motif discovery programs search for statistically significant, well-conserved and over-represented patterns in given promoter sequences. When gene expression data is available, there are three main paradigms for motif discovery: cluster-first, regression, and joint probabilis...

Knowledge Discovery in Biosequences Using Sort Regular Patterns

This paper considers knowledge discovery by sort regular patterns, which are strings over sort letters representing finite sets of basic letters. We devise a learning algorithm for the class based on the minimal multiple generalization technique, and evaluate the method by experiments on biosequences from the GenBank database. The experiments show that a relatively simple sort pattern can represent a...

Genetic Algorithm Based Probabilistic Motif Discovery in Unaligned Biological Sequences

Finding motifs in biosequences is the most important primitive operation in computational biology. A motif discovery algorithm faces many computational requirements, such as memory space and computational complexity. To overcome the complexity of motif discovery, we propose an alternative solution integrating genetic algorithm and Fuzzy ART machine learning approaches...

Efficient exact motif discovery

MOTIVATION: The motif discovery problem consists of finding over-represented patterns in a collection of biosequences. It is one of the classical sequence analysis problems, but still has not been satisfactorily solved in an exact and efficient manner. This is partly due to the large number of possibilities of defining the motif search space and the notion of over-representation. Even for well-d...

A sequential method for discovering probabilistic motifs in proteins.

OBJECTIVES: This paper proposes a greedy algorithm for learning a mixture of motifs model through likelihood maximization, in order to discover common substrings, known as motifs, from a given collection of related biosequences. METHODS: The approach sequentially adds a new motif component to a mixture model by performing a combined scheme of global and local search for appropriately initializi...

Publication date: 1998